13 research outputs found

    Combining Concept- with Content-based Multimedia Retrieval

    The arrival of the XML standard opened new doors for structured document search. A common approach in XML retrieval is to exploit the document's structure directly. However, this is likely to fail for two reasons. First, it neglects the rich multimedia character of documents on the Internet, where a wide variety of multimedia objects can be found, such as text, images and streaming video. Second, using the document structure as the basis for searching the content of a document can easily lead to semantic misinterpretation of the document's content. This chapter discusses an approach for searching rich multimedia document collections that tackles these two problems using a combination of conceptual search and content-based retrieval.

    Acoi: A System for Indexing Multimedia Objects

    The explosion in the number of Web pages has also made countless multimedia objects accessible. Their abundance makes the Internet an interesting application for multimedia retrieval systems. Many search engines are starting to supply some functionality for independent retrieval of these objects. However, most of these multimedia search engines aim at a fixed set of multimedia index attributes. The Acoi system provides an extensible framework for retrieving multimedia objects of any type on the basis of their content, using both low-level features and high-level concepts, and their context.

    Indexing real-world data using semi-structured documents

    We address the problem of deriving meaningful semantic index information for a multimedia database using a semi-structured document model. We show how our framework, called feature grammars, can be used to (1) exploit third-party interpretation modules for real-world unstructured components, and (2) use context-free grammars to convert such poorly structured or unstructured input to semi-structured output. The basic idea is to enrich context-free grammars with special symbols called detectors, which provide the necessary structure just-in-time to satisfy a parser look-ahead. A prototype implementation has been constructed in the Acoi project to demonstrate the feasibility of this approach for indexing both images and audio documents.

    Querying XML Documents Made Easy: Nearest Concept Queries

    Due to the ubiquity and popularity of XML, users often find themselves in the following situation: they want to query XML documents that contain potentially interesting information, but they are unaware of the mark-up structure that is used. For example, it is easy to guess the contents of an XML bibliography file, whereas the mark-up depends on the methodological, cultural and personal background of the author(s). Nonetheless, it is this hierarchical structure that forms the basis of XML query languages. In this paper we exploit the tree structure of XML documents to equip users with a powerful tool, the meet operator, that lets them query databases with whose content they are familiar, but without requiring knowledge of tags and hierarchies. Our approach is based on computing the lowest common ancestor of nodes in the XML syntax tree: e.g., given two strings, we are looking for nodes whose descendants contain these two strings. The novelty of this approach is that the result type is unknown at query formulation time and dependent on the database instance. If the two strings are an author's name and a year, mainly publications of the author in this year are returned. If the two strings are numbers, the result mostly consists of publications that have the numbers as year or page numbers. Because the result type of a query is not specified by the user, we refer to the lowest common ancestor as the nearest concept. We also present a running example taken from the bibliography domain, and demonstrate that the operator can be implemented efficiently.
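    The lowest-common-ancestor idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `meet`, the substring matching rule, and the tiny bibliography document are all assumptions made for illustration, and the sketch uses only Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET


def meet(root, term_a, term_b):
    """Return the deepest elements whose subtree text contains both terms.

    Hypothetical helper: a naive stand-in for a "meet" operator, using
    plain substring matching on element text and tail text.
    """
    results = []

    def visit(node):
        # Report (has_a, has_b, found) for the subtree rooted at node, where
        # found means a meet was already reported at or below this node.
        text = node.text or ""
        has_a = term_a in text
        has_b = term_b in text
        found = False
        for child in node:
            child_a, child_b, child_found = visit(child)
            tail = child.tail or ""
            has_a = has_a or child_a or term_a in tail
            has_b = has_b or child_b or term_b in tail
            found = found or child_found
        if has_a and has_b and not found:
            # Both terms occur here and no deeper node covers both:
            # this node is a lowest common ancestor.
            results.append(node)
            found = True
        return has_a, has_b, found

    visit(root)
    return results


# Invented sample data, not taken from the paper.
DOC = """
<bib>
  <article><author>Smith</author><year>1999</year></article>
  <book><author>Smith</author><year>2001</year></book>
</bib>
"""

root = ET.fromstring(DOC)
print([e.tag for e in meet(root, "Smith", "1999")])
print([e.tag for e in meet(root, "Smith", "2001")])
```

    Note that the caller never names a tag: which element type comes back depends only on where the two strings happen to occur in the document, which is the property the abstract highlights.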

    Flexible and scalable digital library search

    This report describes the development of a specialised search engine for a digital library. The proposed system architecture consists of three levels: the conceptual, the logical and the physical level. The conceptual level enables semantically rich conceptual search by exposing a domain-specific schema. The logical level provides a description language to achieve a high degree of flexibility for multimedia retrieval. The physical level takes care of scalable and efficient persistent data storage. The role played by each level changes during the various stages of a search engine's lifecycle: (1) modeling the index, (2) populating and maintaining the index and (3) querying the index. The integration of all this functionality allows the combination of both conceptual and content-based querying in the query stage. A search engine for the Australian Open tennis tournament website is used as a running example, which shows the power of the complete architecture and its various components.

    Digital Media Warehouses


    Feature Grammars

    We propose a grammatical view of the problem of integrating different data items under a database perspective. We introduce a variant of context-free grammars, called feature grammars, whose parsers may rewrite their input stream. This allows us to provide a simple mechanism for describing and maintaining indexes to Internet multimedia documents. Integration of parser instances as mediators into a database system provides a remarkably transparent framework for indexing external sources and also facilitates the use of plug-in modules provided by third parties. Rewriting the input stream allows us to (1) interpret input data and replace them by their interpretations, and (2) integrate data from different sources by linking them into the input stream in the spirit of a structuring schema. The techniques described are used in the Dutch Acoi project.

    Efficient Relational Storage and Retrieval of XML Documents (Extended Version)

    In this paper, we present a data model and an execution model that allow for efficient storage and retrieval of XML documents in a relational database. The data model is strictly based on the notion of binary associations: by decomposing XML documents into small, flexible and semantically homogeneous units we are able to exploit the performance potential of vertical fragmentation. Moreover, our approach provides clear and intuitive semantics, which facilitates the definition of a declarative query algebra. Our experimental results with large collections of XML documents demonstrate the effectiveness of the techniques proposed.
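    As a rough illustration of decomposing an XML document into binary associations, the sketch below shreds a document into two-column relations. The abstract does not spell out the actual schema, so several details here are assumptions: naming each relation after the root-to-node path, the `/text` and `/@attribute` suffixes, the integer object identifiers, and the hypothetical function name `shred`.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from itertools import count


def shred(root):
    """Decompose an element tree into binary relations keyed by path.

    Hypothetical sketch: every relation holds 2-tuples, either
    (parent_oid, child_oid) for an edge or (oid, string) for a value.
    Tail text (mixed content) is ignored to keep the example short.
    """
    relations = defaultdict(list)
    oids = count(1)  # assumed scheme: consecutive integer identifiers

    def visit(node, path, oid):
        text = (node.text or "").strip()
        if text:
            relations[path + "/text"].append((oid, text))
        for name, value in node.attrib.items():
            relations[f"{path}/@{name}"].append((oid, value))
        for child in node:
            child_oid = next(oids)
            child_path = f"{path}/{child.tag}"
            # One relation per distinct path groups edges of the same kind.
            relations[child_path].append((oid, child_oid))
            visit(child, child_path, child_oid)

    visit(root, root.tag, next(oids))
    return dict(relations)


# Invented sample data, not taken from the paper.
doc = ET.fromstring(
    '<bib><article key="a1"><author>Smith</author><year>1999</year></article></bib>'
)
rel = shred(doc)
for name in sorted(rel):
    print(name, rel[name])
```

    Because each path gets its own narrow relation, a query that touches only authors and years reads just those relations, which is one way to picture the vertical fragmentation mentioned in the abstract.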